An integer linear programming approach for approximate string comparison
Abstract
We introduce a problem called Maximum Common Characters in Blocks (MCCB), which arises in applications of approximate string comparison, particularly in the unification of possibly erroneous textual data coming from different sources. We show that this problem is NP-complete, but can nevertheless be solved satisfactorily using integer linear programming for instances of practical interest. Two integer linear formulations are proposed and compared in terms of their linear relaxations. We also compare the results of the approximate matching with other known measures such as the Levenshtein (edit) distance.
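For reference, the Levenshtein (edit) distance mentioned as a comparison measure can be sketched as follows. This is the standard dynamic-programming recurrence, not the MCCB measure or either of the paper's integer linear formulations, neither of which is specified in this abstract. The function name `levenshtein` is an illustrative choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # The distance is symmetric, so swap to keep the stored row short.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b; start with the empty prefix of a.
    previous = list(range(len(b) + 1))

    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current

    return previous[-1]
```

For example, `levenshtein("kitten", "sitting")` evaluates to 3 (two substitutions and one insertion).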
Similar resources
A Non-radial Approach for Setting Integer-valued Targets in Data Envelopment Analysis
Data Envelopment Analysis (DEA) has been widely studied in the literature since its inception with the work of Charnes, Cooper and Rhodes in 1978. The methodology behind the classical DEA method is to determine how much improvement in the output (input) dimensions is necessary in order to render the units under evaluation efficient. One of the underlying assumptions of this methodology is that the units consume and prod...
A Non-linear Integer Bi-level Programming Model for Competitive Facility Location of Distribution Centers
The facility location problem is a strategic decision for a supply chain, one that determines the profitability and sustainability of its components. This paper deals with a scenario where two supply chains, each consisting of a producer, a number of distribution centers and several retailers provided with similar products, compete to maintain their market shares by opening new distribution cent...
An L1-norm method for generating all of efficient solutions of multi-objective integer linear programming problem
This paper extends the method proposed by Jahanshahloo et al. (2004) (a method for generating all the efficient solutions of a 0–1 multi-objective linear programming problem, Asia-Pacific Journal of Operational Research). It considers the recession direction for a multi-objective integer linear programming (MOILP) problem and presents necessary and sufficient conditions to have unbounde...
RESOLUTION METHOD FOR MIXED INTEGER LINEAR MULTIPLICATIVE-LINEAR BILEVEL PROBLEMS BASED ON DECOMPOSITION TECHNIQUE
In this paper, we propose an algorithm based on a decomposition technique for solving mixed integer linear multiplicative-linear bilevel problems. This algorithm is an application of the algorithm given by G. K. Saharidis et al. to the case in which the first-level objective function is linear multiplicative. We use quasi-concavity properties of bilevel programming problems and decompose th...
Parameterized matching on non-linear structures
The classical pattern matching paradigm is that of seeking occurrences of one string in another, where both strings are drawn from an alphabet set Σ. In the parameterized pattern matching model, a consistent renaming of symbols from Σ is allowed in a match. The parameterized matching paradigm has proven useful in problems in software engineering, computer vision, and other applications. In clas...
Journal title:
- European Journal of Operational Research
Volume 198, Issue
Pages -
Publication date 2009